1. Introduction and Statement of Problem

We consider a stochastic linear system with additive "noise" and an additive input which is under our control. The controlled process is described by the stochastic differential equation

$$dx(t) = \alpha x(t)\,dt + \sigma\,dw(t) + d\nu(t), \qquad x(0) = x. \tag{1.1}$$

Here $x(t) \in R^1$ represents the coordinate of the system, $\sigma > 0$ and $\alpha$ are constants, $w(t)$ is a standard Wiener process on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and $\nu(t)$ is an $\mathcal{F}_t$-adapted process of bounded variation.

The running cost is described by a function $g(x,t)$ and the terminal cost by a function $G(x)$. A constant $c > 0$ represents the unit cost of input. The objective is to find

$$\min_{\nu} E\left\{ \int_0^T g(x(t),t)\,dt + c\,\nu(T) + G(x(T)) \right\}, \tag{1.2}$$

where the minimum is taken over all $\mathcal{F}_t$-adapted processes $\nu$ of finite expected variation.

Parallel to the above stochastic problem, we consider a deterministic control problem

$$dy(t) = \alpha y(t)\,dt + dU(t), \qquad y(0) = x, \tag{1.3}$$

with the objective of finding

$$\min_{U} \left\{ \int_0^T g(y(t),t)\,dt + c\,U(T) + G(y(T)) \right\}. \tag{1.4}$$


It will be shown that there exists an optimal path $y^*(\cdot)$ such that, whatever the initial state, the optimal policy consists of following this path while exerting the minimal control necessary to do so.

In the stochastic problem the optimal policy looks similar to the deterministic one: it is necessary to follow $y^*(\cdot)$ as closely as possible. In this case, however, an optimal policy does not exist, because the control which forces a Brownian motion onto a deterministic path is of unbounded variation.

We will also consider deterministic and stochastic problems with bounded control rates. In these problems $U$ is subject to

$$U(t) = \int_0^t u(s)\,ds \quad \text{with } |u(s)| \le M. \tag{1.5}$$

It will be shown that when $M \to \infty$ the optimal cost in these problems converges to the optimal cost of the original problem. The optimal control is bang-bang, that is, $u$ is equal to either $+M$ or $-M$.


It is interesting to contrast our results with the discrete-time analog of this problem treated in Bes and Sethi [1987]. While in both cases it is possible to obtain equivalent deterministic problems, there are certain important differences between them. In the discrete-time case, the optimal feedback control can be explicitly constructed from the optimal control of the equivalent deterministic problem, and the optimal state trajectory arising from the feedback control is not deterministic in general. In the continuous-time case, on the other hand, the optimal state trajectory is deterministic in general and there exists no optimal policy yielding that trajectory.

The paper is structured as follows. In Section 2 we study the deterministic problems and find the equation for the optimal path $y^*(\cdot)$. We show that the optimal cost of the bounded control rate problem converges to the optimal cost (1.4). In Section 3 we prove that the optimal cost (1.2) is equal to that of (1.4) and we construct an $\varepsilon$-optimal policy by keeping the controlled process within a narrow strip around $y^*(\cdot)$ and reflecting it at the boundaries of the strip.


2.  Deterministic  model. 

We start with a controlled process governed by the following equation

$$y(t) = x + \int_0^t \alpha y(s)\,ds + U(t), \qquad 0 \le t \le T. \tag{2.1}$$

Here $\alpha$ is a constant and $U(t)$, $t \le T$, is a right-continuous process of bounded variation. We denote the set of all such processes by $\mathcal{A}$.

Let $G(x)$ be a nonnegative continuously differentiable strictly convex function such that

$$G'(x) \to \infty \quad \text{as } |x| \to \infty. \tag{2.2}$$

Let $g(x,t)$ be a twice continuously differentiable function of two arguments such that there exist constants $c_1, c_2 > 0$ such that

$$c_1 \le \frac{\partial^2 g(x,t)}{\partial x^2}, \tag{2.3}$$

$$\frac{\partial^2 g(x,t)}{\partial x\,\partial t} \le c_2. \tag{2.4}$$

With each $U \in \mathcal{A}$ we associate a cost functional

$$J_x(U) = \int_0^T g(y(t),t)\,dt + G(y(T)) + c\,U(T). \tag{2.5}$$

The objective is to find

$$v(x) = \min_{U \in \mathcal{A}} J_x(U) \tag{2.6}$$

and $U^*$ such that

$$J_x(U^*) = v(x). \tag{2.7}$$


Let $y(t)$ be any trajectory given by (2.1). Consider it as a continuous contour $Y$ in the two-dimensional plane $R^2 = \{(y,t)\}$ (if $y(s) \ne y(s-)$ then we connect the points $(y(s-),s)$ and $(y(s),s)$ with a segment). Then, using (2.1) to represent $U(t)$,

$$J_x(U) = \int_Y (g(y,t) - c\alpha y)\,dt + c\int_Y dy + cx + G(y(T)). \tag{2.8}$$

Let $U_1$ and $U_2$ be two controls which yield trajectories $y_1$ and $y_2$ such that $y_1(T) = y_2(T)$. Assume for a moment that $y_1(t) > y_2(t)$ for all $t < T$ and let $S$ be the closed region bounded by the contours $Y_1$ and $Y_2$. Then, by virtue of (2.8),

$$J_x(U_1) - J_x(U_2) = \oint (g(y,t) - c\alpha y)\,dt + \oint c\,dy = \iint_S \left( \frac{\partial g(y,t)}{\partial x} - c\alpha \right) dt\,dy. \tag{2.9}$$

(The last equality in (2.9) is due to Green's formula. Note that $\oint$ stands for the integral taken in the counterclockwise direction.) Formula (2.9) suggests the equation for the optimal trajectory.

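Denote by $\bar y(s)$ the function for which

$$\frac{\partial g}{\partial x}(\bar y(s), s) = c\alpha. \tag{2.10}$$

By virtue of (2.3), formula (2.10) uniquely determines $\bar y(s)$ for each $s$. In view of (2.4),

$$\frac{d\bar y(s)}{ds} = -\,\frac{\partial^2 g(\bar y(s),s)}{\partial x\,\partial t} \Big/ \frac{\partial^2 g(\bar y(s),s)}{\partial x^2}. \tag{2.11}$$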


In the remainder of this section we will prove that $\bar y(s)$ determined by (2.10) represents the optimal trajectory.
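To make (2.10) and (2.11) concrete, the following minimal sketch solves (2.10) numerically by bisection and compares the slope of the result with the right-hand side of (2.11). The running cost $g(x,t) = (x - \sin t)^2$ and the values of $c$ and $\alpha$ are assumptions chosen only for illustration; they are not taken from the paper. For this $g$ the solution is known in closed form, $\sin s + c\alpha/2$, which gives something to check against.

```python
import math

C = 2.0      # unit cost of control c (illustrative value, an assumption)
ALPHA = 0.5  # drift coefficient alpha (illustrative value, an assumption)


def dg_dx(x, t):
    # Partial derivative in x of the assumed cost g(x, t) = (x - sin t)^2.
    return 2.0 * (x - math.sin(t))


def y_bar(s, lo=-100.0, hi=100.0, tol=1e-12):
    """Solve dg/dx(y, s) = c * alpha for y by bisection.

    Bisection is valid because dg/dx is strictly increasing in x,
    which is what condition (2.3) guarantees.
    """
    target = C * ALPHA
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dg_dx(mid, s) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def y_bar_slope(s):
    """Right-hand side of (2.11) for the assumed g."""
    d2g_dxdt = -2.0 * math.cos(s)
    d2g_dx2 = 2.0
    return -d2g_dxdt / d2g_dx2


# Compare a central finite difference of y_bar with formula (2.11).
h = 1e-4
numeric_slope = (y_bar(1.0 + h) - y_bar(1.0 - h)) / (2.0 * h)
print(y_bar(1.0), numeric_slope, y_bar_slope(1.0))
```

Bisection is used here only because it needs nothing beyond monotonicity of $\partial g/\partial x$; any root finder would do.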


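(2.12) Theorem. Let $\bar y(\cdot)$ be determined by (2.10) and let $a$ be the (unique) solution of

$$G'(a) = c. \tag{2.13}$$

Then the optimal control $U^*$ is given by the formula

$$U^*(t) = \bar y(t) - x - \int_0^t \alpha \bar y(s)\,ds + 1_{t=T}\,(a - \bar y(t)). \tag{2.14}$$

The optimal trajectory $y^*$ is then

$$y^*(t) = \begin{cases} \bar y(t), & \text{if } t < T, \\ a, & \text{if } t = T. \end{cases} \tag{2.15}$$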


Proof. First notice that the strict convexity of $G$ and (2.2) imply existence and uniqueness of the solution of (2.13). Also, a simple calculation shows that (2.15) follows from (2.14).

Note that the policy $U^*$ moves the controlled process instantaneously from $x$ to $\bar y(0)$, then follows the trajectory $\bar y(\cdot)$, and at the moment $T$ moves the process instantaneously to the point $a$.
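As a purely illustrative instance of formulas (2.10) and (2.13)-(2.15), take $g(x,t) = (x - \sin t)^2$ and $G(x) = x^2/2$. These particular functions are assumptions made for this example only, not choices from the paper. Writing $\bar y$ for the function determined by (2.10), the three stages of the policy (jump, follow, jump) become explicit:

```latex
\begin{align*}
\frac{\partial g}{\partial x}(\bar y(s), s) = 2(\bar y(s) - \sin s) = c\alpha
  &\;\Longrightarrow\; \bar y(s) = \sin s + \tfrac{c\alpha}{2}, \\
G'(a) = a = c
  &\;\Longrightarrow\; a = c, \\
U^*(t) = \bar y(t) - x - \alpha \int_0^t \bar y(s)\,ds
  &= \sin t + \tfrac{c\alpha}{2} - x - \alpha(1 - \cos t) - \tfrac{c\alpha^2}{2}\,t,
  \qquad t < T, \\
U^*(T) &= a - x - \alpha(1 - \cos T) - \tfrac{c\alpha^2}{2}\,T .
\end{align*}
```

Here $U^*(0) = \tfrac{c\alpha}{2} - x$ is the initial jump from $x$ to $\bar y(0)$, and the difference between the last two lines at $t = T$ is the terminal jump $a - \bar y(T)$.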

Consider the contour $Y^*$ associated with the trajectory $y^*$ (to be specific we assume $x < \bar y(0)$ and $a < \bar y(T)$):

$$Y^* = \{(y,t) : y = \bar y(t),\ 0 \le t \le T\} \cup \{(y,t) : t = 0,\ x \le y \le \bar y(0)\} \cup \{(y,t) : t = T,\ a \le y \le \bar y(T)\}.$$

The contour $Y^*$ consists of the graph of the function $\bar y$ and two segments, one connecting the initial point $x$ and $\bar y(0)$, the other connecting $a$ and $\bar y(T)$.

Let $U$ be any other control and $y$ be the corresponding trajectory. Suppose $y(T) \ne a$. Consider

$$U_1(t) = U(t) + 1_{t=T}\,(a - y(T)).$$

Then

$$J_x(U) - J_x(U_1) = G(y(T)) - G(a) - c\,(y(T) - a). \tag{2.16}$$

In view of (2.13) and the strict convexity of $G$, the right-hand side of (2.16) is strictly positive. Therefore we may consider only those controls $U$ and the corresponding trajectories $y$ for which

$$y(T) = a. \tag{2.17}$$


Let $Y$ be the contour associated with $y$. This contour consists of the graph of the function $y(\cdot)$ and the vertical segments connecting the discontinuities of this graph (including the segment connecting $x$ with $y(0)$). Using (2.8), we can write

$$\begin{aligned}
J_x(U) - J_x(U^*) &= \int_Y \left[ (g(y,t) - c\alpha y)\,dt + c\,dy \right] - \int_{Y^*} \left[ (g(y,t) - c\alpha y)\,dt + c\,dy \right] \\
&= \int_Y (g(y,t) - c\alpha y)\,dt - \int_{Y^*} (g(y,t) - c\alpha y)\,dt.
\end{aligned} \tag{2.18}$$

The last equality in (2.18) is due to the fact that $\int_Y c\,dy = \int_{Y^*} c\,dy = c(a - x)$.

Assume that there exist $k \ge 1$ and $0 = t_0 < t_1 < \cdots < t_k = T$ such that $y(t_i-) \le y^*(t_i) \le y(t_i)$ and $y^*(s) - y(s)$ does not change sign on $(t_{i-1}, t_i)$, $i = 1, 2, \ldots, k$. The latter means that the contours $Y$ and $Y^*$ intersect at the points $(y^*(t_i), t_i)$ and that on any interval $(t_{i-1}, t_i)$ the graph of the function $y(\cdot)$ does not intersect the graph of the function $y^*(\cdot)$, so it is located above (or below) the graph of $y^*(\cdot)$. (The case in which $k$ is infinite is considered similarly.)

Let the set of integers $I_1$ (the set $I_2$) be the set of all $i$ for which $y(s) > y^*(s)$ ($y(s) < y^*(s)$) for $s \in (t_{i-1}, t_i)$. Let $\partial S_i$ be the closed loop formed by $Y^* \cap (R \times [t_{i-1}, t_i])$ and the part of $Y \cap (R \times [t_{i-1}, t_i])$ which lies above (below) $Y^* \cap (R \times [t_{i-1}, t_i])$ if $i \in I_1$ (if $i \in I_2$). Note that if $y(t_{i-1}-) = y(t_{i-1})$ and $y(t_i-) = y(t_i)$ then $\partial S_i = (Y \cup Y^*) \cap (R \times [t_{i-1}, t_i])$. Let $S_i$ be the set enclosed by $\partial S_i$.

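Using (2.18), we can write

$$J_x(U) - J_x(U^*) = \sum_{i \in I_1} \oint_{\partial S_i} (g(y,t) - c\alpha y)\,dt - \sum_{i \in I_2} \oint_{\partial S_i} (g(y,t) - c\alpha y)\,dt. \tag{2.19}$$

Using Green's formula we transform (2.19) into

$$J_x(U) - J_x(U^*) = \sum_{i \in I_1} \iint_{S_i} \left( \frac{\partial g(y,t)}{\partial x} - c\alpha \right) dy\,dt - \sum_{i \in I_2} \iint_{S_i} \left( \frac{\partial g(y,t)}{\partial x} - c\alpha \right) dy\,dt. \tag{2.20}$$

In view of (2.3), $\frac{\partial g}{\partial x}(x,t)$ is an increasing function of $x$, hence $\frac{\partial g}{\partial x}(y,t) \ge c\alpha$ for all $y \ge \bar y(t)$. Since $y \ge \bar y(t)$ for every $(y,t) \in S_i$ such that $i \in I_1$ and $0 \le t \le T$, we get nonnegativity of every integrand in the first sum in (2.20). Likewise every integrand in the second sum in (2.20) is nonpositive. The latter implies

$$J_x(U) - J_x(U^*) \ge 0,$$

which proves the theorem.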


(2.21) Corollary. The optimal cost $v(x)$ is given by the formula

$$v(x) = c(a - x) + G(a) + \int_0^T \left( g(\bar y(t),t) - c\alpha \bar y(t) \right) dt.$$


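Let $\mathcal{A}_M$ be the set of all $U \in \mathcal{A}$ subject to (1.5). Denote

$$v_M(x) = \inf_{U \in \mathcal{A}_M} J_x(U).$$

Since $\mathcal{A}_M \subset \mathcal{A}_{M'} \subset \mathcal{A}$ for $M < M'$, it is obvious that $v_M(x)$ is a nonincreasing function of $M$ and $v_M(x) \ge v(x)$.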
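Let

$$\tau_1^M = \begin{cases} \min\{t : x + Mt = \bar y(t)\}, & \text{if } x < \bar y(0), \\ \min\{t : x - Mt = \bar y(t)\}, & \text{if } x > \bar y(0), \end{cases}$$

$$\tau_2^M = \begin{cases} \max\{t : a + (T - t)M = \bar y(t)\}, & \text{if } a < \bar y(T), \\ \max\{t : a - (T - t)M = \bar y(t)\}, & \text{if } a > \bar y(T). \end{cases}$$

Let $N$ be such that for each $M \ge N$

$$\tau_1^M < \tau_2^M.$$

Let $N_1 = \alpha \max(|\bar y(t)|,\ 0 \le t \le T) + c_2/c_1$, where $c_1, c_2$ are given by (2.3), (2.4). For any $M \ge N_1 \vee N$ put

$$u^M(t) = \begin{cases} M\,\mathrm{sign}(\bar y(0) - x), & \text{if } t < \tau_1^M, \\ \dfrac{d\bar y(t)}{dt} - \alpha \bar y(t), & \text{if } \tau_1^M \le t < \tau_2^M, \\ M\,\mathrm{sign}(a - \bar y(T)), & \text{if } \tau_2^M \le t \le T. \end{cases}$$

By virtue of (2.11)

$$U^M(t) = \int_0^t u^M(s)\,ds \in \mathcal{A}_M.$$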

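It is easy to see that $\tau_1^M \to 0$ and $\tau_2^M \to T$ as $M \to \infty$. Hence

$$\begin{aligned}
J_x(U^M) = {} & c(a - x) + G(a) + \int_{\tau_1^M}^{\tau_2^M} g(\bar y(t),t)\,dt \\
& + \int_0^{\tau_1^M} g(x \pm Mt, t)\,dt + \int_{\tau_2^M}^T g(a \pm M(T - t), t)\,dt \\
& - \int_{\tau_1^M}^{\tau_2^M} c\alpha \bar y(t)\,dt \;\to\; v(x).
\end{aligned}$$

The latter shows

$$v_M(x) \to v(x) \quad \text{as } M \to \infty.$$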
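The claim that the first switching time shrinks to zero as the rate bound grows can be checked on a toy case. The sketch below computes the time at which the ramp $x + Mt$ first meets the target path, for the case where $x$ lies below the path at time 0. The linear target path and the values of $M$ are hypothetical, chosen only so that the exact answer $1/(M - 0.5)$ is available for comparison.

```python
def y_bar(t):
    # Hypothetical target path, used only for illustration.
    return 1.0 + 0.5 * t


def tau_1(x, M, T=1.0, iters=80):
    """Time at which the ramp x + M*t meets y_bar, for the case x < y_bar(0).

    Uses bisection on [0, T]. This returns the first hitting time provided
    the gap y_bar(t) - (x + M*t) is decreasing in t, which holds whenever
    M exceeds the slope of y_bar.
    """
    def gap(t):
        return y_bar(t) - (x + M * t)

    if gap(0.0) <= 0.0:
        raise ValueError("expected x < y_bar(0)")
    if gap(T) > 0.0:
        raise ValueError("M too small: the ramp does not reach y_bar before T")
    lo, hi = 0.0, T
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Switching times for increasing rate bounds M, starting from x = 0.
taus = [tau_1(0.0, M) for M in (2.0, 10.0, 100.0, 1000.0)]
print(taus)
```

The same routine, run backwards in time from $T$, would give the second switching time; it is omitted to keep the sketch short.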


3.  Stochastic  case. 


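Let $\mathcal{V}$ stand for the set of all $\mathcal{F}_t$-adapted processes $\nu$ with

$$E\{|\nu|(T)\} < \infty, \tag{3.1}$$

where $|\nu|$ stands for the variation of the process $\nu(\cdot)$. For each $\nu \in \mathcal{V}$ we define the process $x(\cdot)$ satisfying the following equation

$$x(t) = x + \int_0^t \alpha x(s)\,ds + \sigma w(t) + \nu(t), \tag{3.2}$$

where $\sigma > 0$, $\alpha$ is the same as in Section 2 and $w(t)$ is a standard Wiener process adapted to $\mathcal{F}_t$. With each $\nu \in \mathcal{V}$ is associated the following cost

$$J_x(\nu) = E\left\{ \int_0^T g(x(t),t)\,dt + G(x(T)) + c\,\nu(T) \right\}. \tag{3.3}$$

Similarly, we define

$$F(x) = \inf_{\nu \in \mathcal{V}} J_x(\nu). \tag{3.4}$$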


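Let $\mathcal{V}_M$ stand for all $\nu \in \mathcal{V}$ such that

$$\nu(t) = \int_0^t \eta(s)\,ds, \qquad |\eta(s)| \le M \text{ for all } 0 \le s \le T, \tag{3.5}$$

and

$$F_M(x) = \inf_{\nu \in \mathcal{V}_M} J_x(\nu). \tag{3.6}$$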

(3.7) Theorem. For every $x$

$$F(x) \ge v(x), \tag{3.8}$$

$$F_M(x) \ge v_M(x). \tag{3.9}$$

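Proof. For $\nu \in \mathcal{V}$ put

$$U_\nu(t) = E\{\nu(t)\}. \tag{3.10}$$

By virtue of (3.1) the right-hand side of (3.10) is finite. Also, if $0 = t_0 < t_1 < \cdots < t_k = T$, then

$$\sum_{i=1}^k |U_\nu(t_i) - U_\nu(t_{i-1})| = \sum_{i=1}^k |E\{\nu(t_i) - \nu(t_{i-1})\}| \le \sum_{i=1}^k E\{|\nu(t_i) - \nu(t_{i-1})|\} = E\left\{ \sum_{i=1}^k |\nu(t_i) - \nu(t_{i-1})| \right\} \le E\{|\nu|(T)\}.$$

This shows that $|U_\nu|(T)$ is finite. Let $x(t)$ be given by (3.2). Let $y_\nu(t)$ satisfy (2.1) with $U = U_\nu$. It is obvious that $y_\nu(t) = E\{x(t)\}$. By Jensen's inequality $E\{g(x(t),t)\} \ge g(y_\nu(t),t)$ and $E\{G(x(T))\} \ge G(y_\nu(T))$, therefore

$$\begin{aligned}
J_x(\nu) &= \int_0^T E\{g(x(t),t)\}\,dt + E\{G(x(T))\} + E\{\sigma w(T)\} + c\,E\{\nu(T)\} \\
&\ge \int_0^T g(y_\nu(t),t)\,dt + G(y_\nu(T)) + c\,U_\nu(T) \\
&= J_x(U_\nu).
\end{aligned} \tag{3.11}$$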
Inequality (3.11) implies (3.8). The proof of (3.9) is similar.
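The variation bound used in this proof (the variation of the mean control does not exceed the mean variation of the control) can be illustrated with a finite sample. In the sketch below the expectation is replaced by an empirical average over simulated paths, and the control paths are random walks with uniform increments; both are assumptions made only for illustration. For empirical averages the inequality is just the triangle inequality, so it holds exactly for any sample.

```python
import random


def variation(path):
    """Total variation of a path sampled on a grid."""
    return sum(abs(b - a) for a, b in zip(path, path[1:]))


def mean_path(paths):
    """Pointwise average of the paths, the empirical stand-in for E{nu(t)}."""
    n = len(paths)
    return [sum(p[i] for p in paths) / n for i in range(len(paths[0]))]


def sample_controls(n_paths=200, n_steps=50, seed=0):
    """Hypothetical control paths: random walks with uniform increments."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        p = [0.0]
        for _ in range(n_steps):
            p.append(p[-1] + rng.uniform(-1.0, 1.0))
        paths.append(p)
    return paths


paths = sample_controls()
var_of_mean = variation(mean_path(paths))                         # variation of the mean control
mean_of_var = sum(variation(p) for p in paths) / len(paths)       # mean variation of the controls
print(var_of_mean, mean_of_var)
```

Averaging cancels increments of opposite sign across paths, which is why the first number is typically much smaller than the second.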

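Let $\bar y(t)$ be the function defined by (2.10) and let $a$ be given by (2.13). Without loss of generality we can assume $x < \bar y(0)$ and $a < \bar y(T)$. Fix $\varepsilon > 0$. Let $y_1(t)$ and $y_2(t)$ be three times continuously differentiable functions such that

$$\bar y(t) - \varepsilon < y_1(t) < \bar y(t) < y_2(t) < \bar y(t) + \varepsilon, \quad \text{if } \varepsilon \le t \le T - \varepsilon, \tag{3.12}$$

$$y_1(0) = x - \varepsilon, \qquad y_2(0) = x + \varepsilon, \tag{3.13}$$

$$y_{1,2}(t) = y_{1,2}(0) + t\,(y_{1,2}(\varepsilon) - y_{1,2}(0)), \quad \text{if } 0 \le t \le \varepsilon, \tag{3.14}$$

$$y_1(T) = a - \varepsilon, \qquad y_2(T) = a + \varepsilon, \tag{3.15}$$

$$y_{1,2}(t) = y_{1,2}(T - \varepsilon) + (T - t)\,(y_{1,2}(T) - y_{1,2}(T - \varepsilon)), \quad \text{if } T - \varepsilon \le t \le T. \tag{3.16}$$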


The graphs of $y_1(t)$ and $y_2(t)$ form a "tube" of width not exceeding $2\varepsilon$. This tube encloses the initial point $x$ and the endpoint $a$, and on the interval $(\varepsilon, T - \varepsilon)$ it contains the graph of $\bar y(t)$. Construction of such functions $y_1(\cdot)$ and $y_2(\cdot)$ is rather elementary and we omit it.

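Let $k_\varepsilon(t) \in \mathcal{V}$ be a functional such that

$$x_\varepsilon(t) = x + \int_0^t \alpha x_\varepsilon(s)\,ds + \sigma w(t) + k_\varepsilon(t), \quad 0 \le t \le T, \tag{3.17}$$

$$y_1(t) \le x_\varepsilon(t) \le y_2(t) \quad \text{for all } 0 \le t \le T, \tag{3.18}$$

$$k_\varepsilon(t) = \int_0^t 1_{\{y_1(s)\}}(x_\varepsilon(s))\,d|k_\varepsilon|(s) - \int_0^t 1_{\{y_2(s)\}}(x_\varepsilon(s))\,d|k_\varepsilon|(s). \tag{3.19}$$

The functional $k_\varepsilon(\cdot)$ is the so-called solution of the Skorokhod problem for the Brownian motion with drift $\alpha x$ and diffusion $\sigma$. Its effect is to reflect the Brownian motion from the time-dependent boundaries $y_1(\cdot)$ and $y_2(\cdot)$. The existence of such a functional follows easily from Lions and Sznitman [1984].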
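A discrete-time caricature of the reflected process can help in visualizing (3.17)-(3.19). The sketch below uses a projected Euler scheme: take one uncontrolled Euler step, then project the result back onto the tube; the size of the projection is the increment of the pushing process. This scheme is a swapped-in numerical approximation, not the construction of Lions and Sznitman. The target path, the parameter values, and the simplified tube (the target path shifted down and up by a fixed half-width, without the linear end pieces (3.13)-(3.16)) are all assumptions made for illustration.

```python
import math
import random


def reflected_euler(x0, alpha, sigma, y1, y2, T=1.0, n=1000, seed=0):
    """Projected Euler scheme for a diffusion kept inside [y1(t), y2(t)].

    Returns the time grid, the state path, the pushing process k and the
    total variation of k accumulated on the grid.
    """
    rng = random.Random(seed)
    dt = T / n
    ts, xs, ks = [0.0], [x0], [0.0]
    total_variation = 0.0
    for i in range(1, n + 1):
        t = i * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        free = xs[-1] + alpha * xs[-1] * dt + sigma * dw  # uncontrolled step
        x_new = min(max(free, y1(t)), y2(t))              # project onto the tube
        push = x_new - free                               # nonzero only at a boundary
        ts.append(t)
        xs.append(x_new)
        ks.append(ks[-1] + push)
        total_variation += abs(push)
    return ts, xs, ks, total_variation


EPS = 0.05  # half-width of the tube (illustrative)


def y_bar(t):
    # Hypothetical target path.
    return 1.0 + 0.5 * t


def lower(t):
    return y_bar(t) - EPS


def upper(t):
    return y_bar(t) + EPS


ts, xs, ks, tv = reflected_euler(1.0, 0.5, 0.3, lower, upper)
print(tv, ks[-1])
```

Rerunning with smaller values of `EPS` gives a feel for how the control effort behaves as the tube narrows.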


(3.20) Theorem. As $\varepsilon \to 0$

$$J_x(k_\varepsilon) \to v(x). \tag{3.21}$$


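Proof. Let $D = \max(|a|, |x|, \sup\{|\bar y(s)|,\ 0 \le s \le T\}) + 1$ and let

$$N = \max_{|y| \le D,\ 0 \le t \le T} |g(y,t)|,$$

$$\delta = \max_{|x_1|, |x_2| \le D,\ |x_1 - x_2| \le \varepsilon,\ 0 \le t \le T} |g(x_1,t) - g(x_2,t)|,$$

$$\delta_1 = \max_{|y - a| \le \varepsilon} |G(y) - G(a)|.$$

Then

$$J_x(k_\varepsilon) = E\left\{ \int_0^T g(x_\varepsilon(s),s)\,ds \right\} + E\{G(x_\varepsilon(T))\} + c\,E\{k_\varepsilon(T)\} = I_1 + I_2 + I_3.$$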


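Consider

$$\begin{aligned}
\left| I_1 - \int_0^T g(\bar y(s),s)\,ds \right| \le {} & \left| E \int_0^\varepsilon g(x_\varepsilon(s),s)\,ds \right| + \left| \int_0^\varepsilon g(\bar y(s),s)\,ds \right| \\
& + \left| E \int_{T-\varepsilon}^T g(x_\varepsilon(s),s)\,ds \right| + \left| \int_{T-\varepsilon}^T g(\bar y(s),s)\,ds \right| \\
& + E\left\{ \int_\varepsilon^{T-\varepsilon} |g(x_\varepsilon(s),s) - g(\bar y(s),s)|\,ds \right\}.
\end{aligned} \tag{3.22}$$

In view of (3.12)-(3.16) and (3.18), $|x_\varepsilon(s)| \le D$ if $\varepsilon < 1$. Therefore, each of the first four terms on the right-hand side of (3.22) does not exceed $N\varepsilon$. Applying (3.18) to the integrand in the last term of (3.22), we see that it does not exceed $\delta$. Therefore, (3.22) does not exceed $4N\varepsilon + T\delta$. Since $\delta \to 0$ as $\varepsilon \to 0$, the right-hand side of (3.22) converges to 0.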

By virtue of (3.15) and (3.18), $|x_\varepsilon(T) - a| \le \varepsilon$. Thus,

$$|I_2 - G(a)| \le \delta_1 \to 0 \quad \text{as } \varepsilon \to 0. \tag{3.23}$$

Formula (3.17) shows

$$E\{k_\varepsilon(T)\} = E\{x_\varepsilon(T)\} - x - E\left\{ \int_0^T \alpha x_\varepsilon(s)\,ds \right\}. \tag{3.24}$$

Formulas (3.15) and (3.18) show that the first term on the right-hand side of (3.24) converges to $a$. Likewise, using (3.12)-(3.16) and (3.18), one can show that the last term on the right-hand side of (3.24) converges to $\int_0^T \alpha \bar y(s)\,ds$. Therefore (3.24) converges to $U^*(T)$. This fact, along with (3.23) and the convergence of (3.22) to zero, proves (3.21).

(3.25)  Corollary.  F(x)  =  v(x) . 

The  proof  follows  from  Theorem  (3.7)  and  Theorem  (3.20). 

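Let $y_1(t)$ and $y_2(t)$ satisfy (3.12)-(3.16). Consider the process $x_{\varepsilon,M}(s)$ defined by the following stochastic differential equation

$$dx_{\varepsilon,M}(t) = \alpha x_{\varepsilon,M}(t)\,dt + M\,1_{x_{\varepsilon,M}(t) < y_1(t)}\,dt - M\,1_{x_{\varepsilon,M}(t) > y_2(t)}\,dt + \sigma\,dw(t), \qquad x_{\varepsilon,M}(0) = x.$$

Let

$$\eta_{\varepsilon,M}(s) = \begin{cases} M, & \text{if } x_{\varepsilon,M}(s) < y_1(s), \\ -M, & \text{if } x_{\varepsilon,M}(s) > y_2(s), \\ 0, & \text{otherwise,} \end{cases}$$

and $\nu_{\varepsilon,M}(t) = \int_0^t \eta_{\varepsilon,M}(s)\,ds$. It is obvious that $\nu_{\varepsilon,M} \in \mathcal{V}_M$ and that $x_{\varepsilon,M}$ is the solution of (3.2) with $\nu = \nu_{\varepsilon,M}$. Simple calculations show that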


(3.26) Remark. Although we have identified a trajectory $y^*(\cdot)$ which is optimal for both the deterministic and the stochastic cases, there is no optimal policy in the latter case. Any functional which keeps Brownian motion "stuck" to a deterministic trajectory has a.s. infinite variation on any finite interval.
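The infinite-variation phenomenon in this remark can be seen numerically. The sketch below samples one Brownian path on a fine grid and measures its variation on that grid and on a coarser sub-grid of the same path. Refining the grid can never decrease the measured variation (triangle inequality), and for Brownian motion the expected value grows like the square root of the number of grid intervals, so no finite limit exists. The grid sizes and the seed are arbitrary choices for illustration.

```python
import math
import random


def brownian_path(n, T=1.0, seed=0):
    """Sample a standard Brownian path at n + 1 equally spaced times on [0, T]."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n)
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, sd))
    return w


def variation(path, stride=1):
    """Variation of the path measured on every stride-th grid point."""
    pts = path[::stride]
    return sum(abs(b - a) for a, b in zip(pts, pts[1:]))


w = brownian_path(10_000)
coarse = variation(w, stride=100)  # 100 intervals
fine = variation(w, stride=1)      # 10,000 intervals, same path
print(coarse, fine)
```

A control that cancels the Brownian increments exactly would need at least this much variation on every grid, which is the obstruction described in the remark.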




REFERENCES 


Bertsekas, D. P. (1986). Dynamic programming and stochastic control. Academic Press, New York.

Bes, C. and Sethi, S. (1987). Solution of a class of stochastic linear-convex control problems using deterministic equivalents. To appear in JOTA.

Fleming, W. H. and Rishel, R. W. (1975). Deterministic and stochastic optimal control. Springer, New York.

Harrison, J. M. and Taksar, M. I. (1983). Instantaneous control of Brownian motion. Math. of Oper. Res. 8, 439-453.

Lions, P. L. and Sznitman, A. S. (1984). Stochastic differential equations with reflecting boundary conditions. Comm. Pure and Appl. Math. 37, 511-537.

Menaldi, J. L. and Taksar, M. I. (1987). Singular control of multidimensional Brownian motion. To appear in Proceedings of the Xth IFAC Congress, Munich, Germany 1987.

Sethi, S. and Thompson, G. L. (1981). Optimal control theory: Applications to management science. Martinus Nijhoff Publishing Co., Boston.

Taksar, M. I. (1987). Singular control in a multidimensional space with control costs proportional to displacement. Proceedings of the International Conference on Optimization, Singapore, April 1987, 314-323.




